Making data that moves

There is a package called gganimate that … you guessed it … animates your ggplot work. This is useful if you have many years of data or data that changes in some meaningful way.

Let’s compare Nebraska’s defense to Alabama’s.

Go to the console and type install.packages("gganimate")

library(tidyverse)
## ── Attaching packages ──────────────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.1.0     ✔ purrr   0.2.5
## ✔ tibble  2.0.1     ✔ dplyr   0.7.8
## ✔ tidyr   0.8.2     ✔ stringr 1.3.1
## ✔ readr   1.3.1     ✔ forcats 0.3.0
## Warning: package 'tibble' was built under R version 3.5.2
## ── Conflicts ─────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(gganimate)
## Warning: package 'gganimate' was built under R version 3.5.2
library(ggrepel)
scoringdefense <- read_csv("~/Dropbox/SPMC350-Data-Literacy-and-Analytics-in-Sports/Data/ScoringDefense.csv")
## Parsed with column specification:
## cols(
##   Year = col_double(),
##   Name = col_character(),
##   G = col_double(),
##   TD = col_double(),
##   FG = col_double(),
##   `1XP` = col_double(),
##   `2XP` = col_double(),
##   Safety = col_double(),
##   Points = col_double(),
##   `Points/G` = col_double()
## )
defensivethirddowns <- read_csv("~/Dropbox/SPMC350-Data-Literacy-and-Analytics-in-Sports/Data/OpponentThirdDown.csv")
## Parsed with column specification:
## cols(
##   Year = col_double(),
##   Name = col_character(),
##   G = col_double(),
##   Attempts = col_double(),
##   Conversions = col_double(),
##   `Conversion %` = col_double()
## )
defensethird <- left_join(defensivethirddowns, scoringdefense, by=c('Year', 'Name'))
nudefthird <- defensethird %>% filter(Name == "Nebraska")
aldefthird <- defensethird %>% filter(Name == "Alabama")
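
If you want to see what the join and the filter are doing before trusting them on the full data, here is a minimal sketch on a toy example. The toy_* names and every number in them are made up for illustration; only the column names mirror the real files.

```r
library(dplyr)

# Made-up rows for illustration only; these are not real statistics.
toy_third <- tibble(
  Year = c(2017, 2018, 2018),
  Name = c("Nebraska", "Nebraska", "Alabama"),
  `Conversion %` = c(40, 38, 30)
)
toy_scoring <- tibble(
  Year = c(2017, 2018, 2018),
  Name = c("Nebraska", "Nebraska", "Alabama"),
  `Points/G` = c(36, 31, 18)
)

# Match rows on both Year and Name so each team-season lines up with itself.
toy_joined <- left_join(toy_third, toy_scoring, by = c("Year", "Name"))

# Keep one team's rows, the same way nudefthird is built.
toy_nu <- toy_joined %>% filter(Name == "Nebraska")
toy_nu
```

One thing the toy version hides: both real files have a G column that is not part of the join key, so the real defensethird ends up with G.x and G.y.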

The important bits below:

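  1. You can make dynamic labels. See the labs directive: the {frame_time} in the title is replaced with the current Year as the animation plays.
  2. The rest is simple. Just pass your frame variable, the field that separates one frame from the next (here, Year), to the transition_time function.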
# Grey points are every team; red is Nebraska, black is Alabama.
ggplot(data=defensethird, aes(x=`Conversion %`, y=`Points/G`)) + 
  geom_point(color="grey") + geom_smooth(method=lm, se=FALSE) + 
  geom_point(data=nudefthird, aes(x=`Conversion %`, y=`Points/G`), color="red") + 
  geom_point(data=aldefthird, aes(x=`Conversion %`, y=`Points/G`), color="black") + 
  geom_text(data=nudefthird, aes(x=`Conversion %`, y=`Points/G`, label=Name)) + 
  # No frame aesthetic is needed here: transition_time() below sets the frames.
  geom_text(data=aldefthird, aes(x=`Conversion %`, y=`Points/G`, label=Name)) +
  # {frame_time} in the title is filled in with the current Year.
  labs(title = 'Two teams going in different directions: Alabama vs Nebraska {frame_time}', x = 'Opponent third down', y = 'Points per game') +
  transition_time(Year) +
  enter_fade() + 
  exit_shrink() +
  ease_aes('sine-in-out')
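
Two things you may want once the animation runs: a title that shows whole years, and a file you can share. Because Year is read in as a double, {frame_time} may show fractional years on the in-between frames. Here is a minimal sketch on made-up data that rounds the label and saves the result. The toy data frame, the nframes and fps values and the file name are all assumptions for illustration, and it assumes the gifski package is installed so gganimate can render a GIF.

```r
library(ggplot2)
library(gganimate)

# Made-up numbers for illustration only; these are not real statistics.
toy <- data.frame(
  Year = rep(c(2016, 2017, 2018), times = 2),
  Name = rep(c("Team A", "Team B"), each = 3),
  x = c(40, 38, 35, 30, 33, 37),
  y = c(30, 27, 24, 18, 22, 26)
)

# group = Name tells gganimate which point in one year is the same team in the next.
p <- ggplot(toy, aes(x = x, y = y, group = Name)) +
  geom_point() +
  # round() keeps the title at whole years while frames are interpolated.
  labs(title = "Year: {round(frame_time)}") +
  transition_time(Year)

# nframes and fps are arbitrary choices here; tune them to taste.
anim <- animate(p, nframes = 30, fps = 10)

# Hypothetical file name; this writes the rendered animation to disk.
out_file <- file.path(tempdir(), "toy-animation.gif")
anim_save(out_file, animation = anim)
```

The same round() trick and the animate() and anim_save() calls should drop into the Nebraska and Alabama plot unchanged, as long as you store that plot in a variable first.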

Assignment

Just how out of whack were Nebraska’s penalty yards per game midway through last season? Let’s animate it and find out. I’ll give you some code to get you started. What you have here is every football team in the FBS since 2009, how many penalty yards per game they ran up and how many points per game they scored, all in a dataframe called joined. You should be able to just run this and it will work. No downloading required.

penalties <- read_csv("https://raw.githubusercontent.com/mattwaite/SPMC350-Sports-Data-Analysis-And-Visualization/master/Data/penalties.csv")
## Parsed with column specification:
## cols(
##   Year = col_double(),
##   Name = col_character(),
##   G = col_double(),
##   Pen. = col_double(),
##   Yards = col_double(),
##   `Pen./G` = col_double(),
##   `Yards/G` = col_double()
## )
offense <- read_csv("https://raw.githubusercontent.com/mattwaite/SPMC350-Sports-Data-Analysis-And-Visualization/master/Data/ScoringOffense.csv")
## Parsed with column specification:
## cols(
##   Year = col_double(),
##   Name = col_character(),
##   G = col_double(),
##   TD = col_double(),
##   FG = col_double(),
##   `1XP` = col_double(),
##   `2XP` = col_double(),
##   Safety = col_double(),
##   Points = col_double(),
##   `Points/G` = col_double()
## )
joined <- penalties %>% left_join(offense, by=c("Year", "Name"))

So what you need to do is filter Nebraska into its own dataframe, and then modify the code above so you can see where Nebraska stood over the years when it comes to penalty yards and scoring. You’ll be adjusting the dataframes used, at least one field name, and the labels.
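
Before you animate anything, it can help to check the filter and a static version of the plot. This is only a starting sketch on a made-up stand-in for joined: the toy_* names and all the numbers are invented, and only the column names match the real data. The animation, the labels and the Nebraska label are still yours to add.

```r
library(dplyr)
library(ggplot2)

# Made-up stand-in for joined; the real one comes from the read_csv calls above.
toy_joined <- tibble(
  Year = c(2017, 2017, 2018, 2018),
  Name = c("Nebraska", "Team B", "Nebraska", "Team B"),
  `Yards/G` = c(50, 60, 70, 55),
  `Points/G` = c(25, 30, 30, 28)
)

# Keep only Nebraska's rows in their own dataframe.
toy_nu <- toy_joined %>% filter(Name == "Nebraska")

# Static check: everyone in grey, Nebraska in red on top.
static_check <- ggplot(toy_joined, aes(x = `Yards/G`, y = `Points/G`)) +
  geom_point(color = "grey") +
  geom_point(data = toy_nu, color = "red")
static_check
```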

Rubric

  1. Did you create an animated scatterplot?
  2. Is Nebraska marked on it with the rest of the FBS?
  3. Is Nebraska labeled?
  4. Did you comment your code in Markdown?